Rhetorical Structure Modeling for Lecture Speech Summarization
Abstract
We propose an extractive summarization system built on a novel non-generative probabilistic framework for speech summarization. One of the most under-utilized features in extractive summarization is rhetorical information: semantically cohesive units that are hidden in spoken documents. We propose Rhetorical-State Hidden Markov Models (RSHMMs) to automatically decode this underlying structure in speech. RSHMMs give a 68.67% ROUGE-L F-measure, a 6.44% absolute increase in lecture speech summarization performance over the baseline system without RSHMM. We further propose an enhanced Rhetorical-State Hidden Markov Model (RSHMM++) for extracting hierarchical structural summaries from lecture speech. RSHMM++ gives a 72.01% ROUGE-L F-measure, a 3.34% absolute increase over RSHMM, and so 9.78% over the baseline system without rhetorical information. We also propose Relaxed DTW for compiling reference summaries.
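The results above are reported as ROUGE-L F-measure, which scores a candidate summary by its longest common subsequence (LCS) with a reference summary. The following is a minimal sketch of that metric, not the evaluation code used in this work: it assumes whitespace tokenization, lowercasing, a single reference, and an equal weighting of precision and recall (beta = 1). The function names `lcs_length` and `rouge_l_f` are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the best subsequence ending before both tokens.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of skipping x or skipping y.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f(reference, candidate, beta=1.0):
    """ROUGE-L F-measure for one reference and one candidate string.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)       # share of the reference that is covered
    precision = lcs / len(cand)   # share of the candidate that is matched
    b2 = beta * beta
    return (1 + b2) * precision * recall / (recall + b2 * precision)


if __name__ == "__main__":
    score = rouge_l_f("the cat sat on the mat", "the cat lay on the mat")
    print(f"ROUGE-L F-measure: {score:.4f}")
```

Because LCS rewards in-order matches without requiring them to be contiguous, a summary that keeps the reference's sentences in order but drops or rewords some of them can still score well.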
Related resources
Rhetorical Structure Modeling for Lecture Speech Summarization
We propose an extractive summarization system with a novel non-generative probabilistic framework for speech summarization. One of the most under-utilized features in extractive summarization is rhetorical information: semantically cohesive units that are hidden in spoken documents. We propose Rhetorical-State Hidden Markov Models (RSHMMs) to automatically decode this underlying structure in sp...
Automatic Segmentation and Summarization of Spoken Lectures
The ever-increasing number of online lectures has created an unprecedented opportunity for distance learning. Most online lectures are presented as unstructured text, audio and/or video files, which makes it difficult for students to locate relevant lectures and browse through them. In this thesis, we investigated several automatic lecture segmentation and summarization algorithms. Automatic lectur...
A comparative study on speech summarization of broadcast news and lecture speech
We carry out a comprehensive study of acoustic/prosodic, linguistic and structural features for speech summarization, contrasting two genres of speech, namely Broadcast News and Lecture Speech. We find that acoustic and structural features are more important for Broadcast News summarization due to the speaking styles of anchors and reporters, as well as typical news story flow. Due to the relat...
A Structure-based Method for Speech Summarization
This paper proposes a model and system for speech summarization, aimed at selective listening to specific contents within the entire lecture speech data. Our system uses both a target speech and its corresponding paper. Papers are used to identify the contents in which users are interested, based on structure/surface information. On the other hand, speech is effective for deeply understanding specific contents...